A Modified Principal Component Technique Based on the LASSO

Authors

  • Ian T. JOLLIFFE
  • Nickolay T. TRENDAFILOV
  • Mudassir UDDIN
Abstract

In many multivariate statistical techniques, a set of linear functions of the original p variables is produced. One of the more difficult aspects of these techniques is the interpretation of the linear functions, as these functions usually have nonzero coefficients on all p variables. A common approach is to effectively ignore (treat as zero) any coefficients less than some threshold value, so that the function becomes simple and the interpretation becomes easier for the users. Such a procedure can be misleading. There are alternatives to principal component analysis which restrict the coefficients to a smaller number of possible values in the derivation of the linear functions, or replace the principal components by “principal variables.” This article introduces a new technique, borrowing an idea proposed by Tibshirani in the context of multiple regression where similar problems arise in interpreting regression equations. This approach is the so-called LASSO, the “least absolute shrinkage and selection operator,” in which a bound is introduced on the sum of the absolute values of the coefficients, and in which some coefficients consequently become zero. We explore some of the properties of the new technique, both theoretically and using simulation studies, and apply it to an example.
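As a minimal sketch of the constrained problem the abstract describes (the notation is ours: S denotes the sample covariance or correlation matrix of the p variables, a a vector of loadings, and t the tuning bound), the first modified component solves

\max_{a} \; a^{\top} S a \quad \text{subject to} \quad a^{\top} a = 1, \quad \sum_{j=1}^{p} |a_j| \le t.

For t \ge \sqrt{p} the extra constraint is inactive and ordinary PCA is recovered; as t decreases toward 1, more of the loadings a_j are forced exactly to zero, which is what makes the resulting components easier to interpret.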


Related articles

Differenced-Based Double Shrinking in Partial Linear Models

The partial linear model is very flexible, since the relation between the covariates and the response can be either parametric or nonparametric. However, estimation of the regression coefficients is challenging, since one must also estimate the nonparametric component simultaneously. As a remedy, the differencing approach, which eliminates the nonparametric component so that the regression coefficients can be estimated, can ...
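As a hedged illustration of the differencing idea referred to above (generic partial linear model notation, not taken from this article), suppose the data are ordered by the scalar covariate t_i in the model

y_i = x_i^{\top} \beta + f(t_i) + \varepsilon_i, \qquad i = 1, \dots, n.

First differencing gives y_i - y_{i-1} \approx (x_i - x_{i-1})^{\top} \beta + (\varepsilon_i - \varepsilon_{i-1}), since f(t_i) - f(t_{i-1}) is negligible for smooth f and closely spaced t_i; the nonparametric component is thus approximately eliminated, and \beta can be estimated (and shrunk) from the differenced data alone.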


An Iterative Nonlinear Gaussianization Algorithm for Image Simulation and Synthesis

We propose an Iterative Nonlinear Gaussianization Algorithm (INGA) which seeks a nonlinear map from a set of dependent random variables to independent Gaussian random variables. A direct motivation of INGA is to extend principal component analysis (PCA), which transforms a set of correlated random variables into uncorrelated (independent up to second order) random variables, and Independent Com...
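For orientation only, one generic iterative Gaussianization step of the kind this abstract describes (not necessarily the exact INGA updates) alternates a linear decorrelation with a marginal Gaussianization:

y \leftarrow W x \quad (\text{e.g. a PCA rotation}), \qquad z_j \leftarrow \Phi^{-1}\!\big(\hat{F}_j(y_j)\big) \ \text{for each coordinate } j,

where \hat{F}_j is the empirical marginal distribution function and \Phi^{-1} the standard normal quantile function; iterating the two steps pushes the joint distribution toward an independent Gaussian.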


Texture segmentation through eigen-analysis of the Pseudo-Wigner distribution

In this paper we propose a new method for texture segmentation based on the use of texture feature detectors derived from a decorrelation procedure of a modified version of a Pseudo-Wigner distribution (PWD). The decorrelation procedure is accomplished by a cascade recursive least squared (CRLS) principal component (PC) neural network. The goal is to obtain a more efficient analysis of images by c...


On General Adaptive Sparse Principal Component Analysis

The method of sparse principal component analysis (S-PCA) proposed by Zou, Hastie, and Tibshirani (2006) is an attractive approach to obtain sparse loadings in principal component analysis (PCA). S-PCA was motivated by reformulating PCA as a least-squares problem so that a lasso penalty on the loading coefficients can be applied. In this article, we propose new estimates to improve S-PCA in the...
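Schematically, and in our own notation rather than that of the article, the least-squares reformulation with a lasso penalty that S-PCA builds on can be written as

\min_{A, B} \; \sum_{i=1}^{n} \| x_i - A B^{\top} x_i \|^2 + \lambda \sum_{j=1}^{k} \| \beta_j \|^2 + \sum_{j=1}^{k} \lambda_{1,j} \| \beta_j \|_1 \quad \text{subject to } A^{\top} A = I_k,

where B = [\beta_1, \dots, \beta_k] collects the sparse loading vectors; the \ell_1 penalties drive individual loading coefficients exactly to zero.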



Journal:

Volume   Issue

Pages  -

Publication date: 2006